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Abstract. The chemical composition of ultra high energy cosmic rays is still 
uncertain. The latest results obtained by the Pierre Auger Observatory and the HiRcs 
Collaboration, concerning the measurement of the mean value and the fluctuations of 
the atmospheric depth at which the showers reach the maximum development, X max , 
are inconsistent. From comparison with air shower simulations it can be seen that, 
while the Auger data may be interpreted as a gradual transition to heavy nuclei for 
energies larger than ~ 2 — 3 x 10 18 eV, the HiRes data are consistent with a composition 
dominated by protons. In Ref. 1 it is suggested that a possible explanation of the 
observed deviation of the mean value of X max from the proton expectation, observed by 
Auger, could originate in a statistical bias arising from the approximated exponential 
shape of the X max distribution, combined with the decrease of the number of events 
as a function of primary energy. In this paper we consider a better description of 
the X max distribution and show that the possible bias in the Auger data is at least 
one order of magnitude smaller than the one obtained when assuming an exponential 
distribution. Therefore, we conclude that the deviation of the Auger data from the 
proton expectation is unlikely explained by such statistical effect. 
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1. Introduction 

The nature of the primary cosmic rays is intimately related to the astrophysical objects 
capable of accelerating these particles to such high energies. Also, propagation in the 
intergalactic medium depends on the composition, which affects the resulting spectral 
distribution of the flux observed at Earth. A knowledge of the composition is also very 
important for primary energy reconstruction and for anisotropy studies. 

One of the most important limitations of composition analyses comes from the lack 
of knowledge of the hadronic interactions at the highest energies. Composition studies 
are based on the comparison of experimental data with Monte Carlo simulations of 
atmospheric cosmic rays showers, which makes use of hadronic interaction models which 
extrapolate the available low energy accelerator data to the energies of the cosmic rays. 

One of the most sensitive parameters to the mass of the primary cosmic ray is the 
atmospheric depth at which the showers reach their maximum development. Lighter 
primaries generate showers that are more penetrating, producing larger values of X max . 
Also, the fluctuations of this parameter are smaller for heavier nuclei. The Pierre Auger 
Observatory and the HiRes experiment are able to observe directly the longitudinal 
development of the showers by means of fluorescence telescopes. Therefore, in both 
experiments, the X max parameter of each observed shower can be reconstructed from 
the data taken by the telescopes. 

The mean value and the standard deviation of X max , as a function of primary 
energy, obtained by Auger [2] and HiRes [3] appear to be inconsistent. From the 
comparison with simulations, the Auger data suggest a transition to heavier nuclei 
starting at energies of order of 2 — 3 x 10 18 eV, whereas, the HiRes data are consistent 
with protons in the same energy range. In Ref. [1] a new parameter, the difference 
between the mean value and the standard deviation of X max , was introduced in order to 
reconcile the Auger and HiRes results. This new parameter has the advantage of being 
much less sensitive to the first interaction point than the mean value and the standard 
deviation separately. From a comparison of the experimental values of this parameter, 
obtained by Auger and HiRes, with simulated data, they infer that the composition of 
the cosmic rays is dominated by protons. They say that the energy dependence of the 
distribution of X max , observed by Auger, seems to be caused by an unexpected change 
in the depth of the first interaction point, which can be explained by a rapid increase of 
the cross section and/or increase of the inelasticity. Both possibilities require an abrupt 
onset of new physics in this energy range, which makes them questionable. They also 
suggest that the deviation of the distribution of X max from the proton expectation, 
present in the Auger data, could be originated in the statistical techniques used to 
analyze the data. In particular, they suggest that the deviation of the mean value of 
Xmax from the proton expectation could be explained by a bias originated from the 
exponential nature of the X max distribution and the decreasing number of events as a 
function of primary energy. 

In this work we show that, considering a better description of the X max distribution, 
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the bias in the determination of the mean value of X max become more than one order 
of magnitude smaller than the one obtained for the exponential distribution. We find 
that the value of the bias in the last energy bin (the one with the smallest number of 
events) of the Auger data, published in Ref. [2], is < 1.5 g cm -2 , which is much smaller 
than the systematic errors on the determination of the mean value of X max estimated 
in Ref. 0. 



2. Numerical approach 

Following Ref. [I] let us introduce the parameter, 

«w)=i-^H (i) 

where (X max ) is the mean value of the X max distribution, 

1 N 

x N -= — x { (?) 

max / j max i \ I 

i=l 

is the sample mean corresponding to samples of size N and modefX^J is the value 
of X^ ax that occurs most frequently, i.e. the maximum of the distribution function of 
X^ ax . Therefore, the bias on the determination of (X max ) appears when a particular 
realization of the sample mean is equal to the mode of the sample mean distribution 
function. Note that the sample mean (Eq. ([2])) is an unbiased estimator of the mean 
of the exponential distribution, i.e. E[X^ ax ] = (X max ). In Ref. pG| it is shown that 
approximating the X max distribution by an Exponential function the parameter £(N) 
is given by: £e(N) = 1/N. 

In order to better describe the distribution of X max two different types of functions 
are considered. They are chosen in such a way that the distribution of X^ ax can be 
obtained, at least, in a semi-analytical way. The first function considered is a shifted- 
Gamma distribution [4], 

{Xmax ~ X()) k 1 / X max — Xq 



P G {Xmax) = { T(k) r\ GXP r x ) Xmax - X ° , (3) 

X max < Xq 

where k = 5 and the other two parameters can be obtained from the mean value and 
the standard deviation of X max , 

X = (X max ) - k t x , (4) 

The second function under consideration is the convolution between an exponential 
function and a Gaussian (Exp-Gauss), 

1 [ x ™* ( X max -u\ ( (u-a) 



PEG(Xmax) = : r— n / du exp exp 



1 ( X max — a (3 2 \ ( (3 X max — ( 
_ exp ^ _ + _ j Erfc ^ 
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where a, (3 and A are fitting parameters and 

Erfc(z) = 1 - 4= / dt ex P (-t 2 / 2 ) • ( 7 ) 
V ^ 

A library of simulated showers was generated by using the program CONEX 
(v2r2.3) j5]. Monochromatic samples of 10 4 proton showers were generated from 
\og(E/eV) = 18 to \og(E/eV) = 19.5 in steps of Alog(E/eV) = 0.1. The arrival 
directions of the showers follow an isotropic distribution, such that the zenith angle is 
in the interval [0°, 60°]. The hadronic interaction models considered are QGSJET-II [6] 
and EPOS 1.99 [7]. 

The mean value and the standard deviation (needed for the description of the X max 
distribution using the shifted-Gamma function) were fitted with a quadratic function 
and a linear function of log(.E'), respectively, i.e., 

(X max ) = A + A 1 log(E/eV)+A 2 log 2 (E/eV), (8) 
a\X max ] = B + B 1 \og(E/eV). (9) 

Figure [T] shows the simulated data as well as the fits, for both hadronic interaction 
models considered. The values of the parameters corresponding to Eqs. (EJ) and ((9]) are 
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Figure 1. Mean value (left panel) and the standard deviation (right panel) of X max 
as a function of log(E/eV) obtained by using CONEX with QGSJET-II and EPOS 
1.99 for proton initiated showers. The lines correspond to the fits of the simulated 
data (see the text for details). 



nven in table [TJ 



Table 1. Parameters corresponding to the quadratic and linear fits of (X max ) and 
p[X max ], respectively (see Eqs. © and ©), obtained from simulations for QGSJET-II 
and EPOS 1.99. 





A [g cm 2 ] 


Ai [g cm 2 ] 


A 2 


g cm 2 ] 


B [g cm 2 ] 


B 1 [g cm 2 ] 


QGSJET-II 


-826.171 


124.198 


-2.0 


8037 


113.223 


-2.99273 


EPOS 1.99 


80.4419 


14.1183 


1.27933 


69.4862 


-0.614616 



On the influence of statistics on the determination of (X max ) 



5 



The distribution functions of X max , for every energy and hadronic interaction 
model considered, were fitted with the Exp-Gauss function, Eq. ([6]). The parameters 
a, (3 and A were fitted with linear functions of \og(E), in order to obtain the Exp- 
Gauss representation of the X max distribution for every value of energy in the interval 
[10 18 , 10 19 - 5 ] eV, see |Appendix A| for details. 

Figure [2] shows the distributions of X max , obtained by using CONEX with QJSJET- 
II, for \og(E/eV) = 19 and \og(E/eV) = 19.5. Red solid lines correspond to the fits of 
the simulated data with the Exp-Gauss function. The blue dashed lines correspond to 
the shifted-Gamma function, Eq. fl3]), for which the parameters X Q and Tx are obtained 
by using the expressions of (X max ) and o"[X ma:r ] in Eqs. (JE]) and (G§ to calculate X and 
Tx from Eqs. @ and ([5]), respectively. From the figure it can be seen that the Exp- 
Gauss function is a better fit to the simulated data than the shifted-Gamma function. 
It can also be seen that the tail to larger values of X max is slightly overestimated by the 
Exp-Gauss distribution and underestimated by the shifted-Gamma function. Therefore, 
the distribution function of the universe (samples with iV — > oo) should fall between 
this two functions. 
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Figure 2. Distributions of X max for proton showers generated by using CONEX with 
QGSJET-II. Red solid lines correspond to the fits of the histograms with the Exp- 
Gauss function, Eq. ((6]). The blue dashed lines correspond to the shifted-Gamma 
function, Eq. ((3|), where the parameters X and tx are obtained by using Eqs. ([4}, 
([5]), and the fits of (X max ) and cr[X ma:E ] as a function of \ogE (see text for details). 
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The distribution of X^ ax can be calculated by means of the characteristic function, 
which is defined as the expectation value of exp(itX max ), i.e. <fix max (t) = E[exp(itX max )]. 
It is straightforward to show that the characteristic function of X^ ax is given by 

4>x»jt) = [<t> Xmax {t/N)] N m- 

The characteristic function of the shifted-Gamma distribution is <b% (t) = 
exp(iX t) (1 — itr x )~ k and then the characteristic function of X^ ax is given by 
(p%N (t) = exp(iX t) (1 — Utx I 'N)~ kN , which corresponds also to a shifted-Gamma 
distribution. Therefore, the distribution function of X^ ax is given by, 

{( VN v \Nk—l / vN y \ 

r(Nk)(r x /N)^ eXV { r x /N ) max — ° # (10) 
X max < Xq 

By using Eq. ffTO]) it is easy to show that, 

In this case, ^ is also proportional to 1/N but it is suppressed by the ratio between the 
standard deviation and the mean value of X max . A similar expression is obtained when 
the distribution function of X max is described by a truncated exponential function, see 



Appendix B for details. The blue solid line on the left panel of Fig. [3] corresponds 



to £g as a function of iV for log(E/eV) = 19.5, approximately the mean value of the 
energy (weighted by the spectrum) for the last bin considered in Ref. [ZJ. Note that, 
the number of events in this bin is 34. From the figure, it can be seen that £q is more 
than one order of magnitude smaller than the function 1/N. 

The distribution function of X max is affected by the presence of fluctuations 
introduced by the detectors. The distribution function of X^ ax , including a Gaussian 
uncertainty on the determination of X max is given by, 



mxlx) = -fi- r dX P G (X) exp f- ^ff *H , (12) 

where o~R ec is the standard deviation of such uncertainty. The mode of this distribution 
is calculated numerically. Dashed and dashed-dotted lines on the left panel of Fig. 
[3] correspond to parameter £,g(N) obtained for a Rec = 20 g cm -2 and a Rec = 40 g 
cm -2 , respectively. When a symmetric uncertainty on the determination of X max is 
included, the parameter £ becomes still smaller and decreases for increasing values of 
the uncertainty. This is due to the fact that £ is larger for asymmetric distributions, 
like the exponential, and the convolution of the pure X max distribution with a Gaussian 
is more symmetric than the original one. 

The characteristic function of the Exp-Gauss distribution is the product of the 
characteristic function of the exponential distribution, (1 — iXt)~ l , with the one 
corresponding to a Gaussian, exp(iat — (3 2 t 2 /2). Then, the characteristic function of 

x max is then g iven b Y> 



= ( 1 - V ) ex P ( iat - ^7 2 ) ' ( 13 ) 
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which corresponds to the convolution of a Gamma distribution with a Gaussian, 

N N+l/2 rX« ax _ / V N -v\ 

Peg(.X") = / du (X» - u) N ^ exp - 

/ (u-ct) 2 \ 

xexp (~W~J- (14) 

Last integral is calculated numerically in order to obtain the mode of the resultant 
distribution. The solid red line in the right panel of Fig. [3] shows £eg as a function of 
the sample size for \og(E/eV) = 19.5. Note that £g is smaller than £ EG , this is due to 
the more extended tail to larger values of the Exp-Gauss distribution compared with 
the corresponding one to the shifted-Gamma distribution. In any case, £eg is still about 
one order of magnitude smaller than 1/JV. As for the case of the Gamma distribution, 
dashed and dashed-dotted red lines correspond to <7R ec = 20 g cm -2 and <TR ec = 40 g 
cm -2 , respectively. In this case the effect of the uncertainty on the determination of 
X max is included in P EG just by replacing the parameter f3 by (3 = ^ '/3 2 + o\ ec . As 
expected, the curves that include the uncertainty on the determination of X max fall 
bellow the one corresponding to the ideal case. 




Figure 3. £ as a function of the sample size N corresponding to proton showers of 
\og(E/eV) — 19.5, obtained for the shifted-Gamma distribution (left panel) and for 
the Exp-Gauss distribution (right panel). Blue and red solid lines correspond to the 
ideal case in which X max is determined without any uncertainty. Dashed and dashed- 
dotted lines correspond to the cases in which there is a Gaussian uncertainty on the 
determination of X max of (iR ec = 20 g cm~ 2 and <jft ec = 40 g cm -2 , respectively. The 
hadronic interaction model used is QGSJET-II. 

The left panel of Fig. H] shows the parameter £ as a function of energy corresponding 
to the number of events in each energy bin taken from Ref. 0, for the case in which 
there is no uncertainty on the determination of X max (which gives larger value of £, as 
shown before). The energy assigned to the ith bin, used to calculate £, corresponds to 
the mean value of the energy in the bin weighted by the broken power law fit of the 
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cosmic rays energy spectrum, J(E), of Ref. [9], 



(Ei) 



iff dE EJ{E) 
fff dE J(E) 



(15) 



where Ef and Ef are the lower and upper limits of the ith bin. It can be seen that 
the values of £, obtained by using the Exp-Gauss distribution and the shifted-Gamma 
distribution, are more than one order of magnitude smaller than the corresponding 
one for the exponential distribution, in the whole energy range and for both hadronic 
interaction models considered. As in the previous calculation, £g results are smaller 
than C,eg- I n fact, the £ curve corresponding to the true distribution of X max 
should fall between the curves corresponding to the Exp-Gauss and the shifted-Gamma 
representation of the X max distribution. 
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Figure 4. £ (left panel) and AX max (right panel) as a function of \og(E/eV), for 
the statistics of the Auger data of Ref. [5] . Solid lines correspond to QGSJET-II and 
dashed lines correspond EPOS 1.99. 



The right panel of Fig. H] shows the parameter AX max = (X max ) £ which gives 
the grammage of the shift suffered by (X max ) if X^ ax takes the value of the mode of 
its distribution. It can be seen, that for the last energy bin, the one with 34 events, 
AX max is < 1.5 g cm -2 , which is much smaller than the systematic uncertainties on the 
determination of (X max ) estimated in Ref. [2]. 

The energy bins considered in the analysis of Ref. [2] have a width of A log(E/eV) = 
0.1 in the energy range from E = 10 18 eV to E = 10 19 eV. Between E = 10 19 eV and 
E = 10 19A eV, A log (.Eye V) changes to 0.2 and the last bin corresponds to E > 10 19,4 
eV. Therefore, the number of events per bin decreases in the energy range from E = 10 18 
eV to E — 10 19 eV, it increases from 96 in the bin [10 18,9 , 10 19 ] eV to 138 in the bin 
[10 19 , 10 19 ' 2 ] eV and then, it decreases for the last two bins. This change in the bin width 
generates the structure around E = 10 191 eV seen on the curves of Fig. HI 

Note that ^eg calculated by using EPOS 1.99 is larger than the corresponding one 
for QGSJET-II, this is due to the fact that the X max distributions obtained with EPOS 
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1.99 are more asymmetric (increase faster, coming from small values of X max , and have 
a more extended tail) than the corresponding ones to QGSJET-II. 

Concerning iron showers, it can be seen that £ takes smaller values than the ones for 
protons. This is due to the large suppression of fluctuations in iron showers, the ratio of 
the standard deviation to the mean value of X max is smaller than for protons, producing 
smaller values of £ (see Eq. pip ). In particular, = £g /K where K increases from 
~ 2.3 at E = 10 18 eV to ~ 2.4 at E = 10 19 5 eV for QGSJET-II. 



3. Conclusions 



In this work we studied in detail statistical bias in the determination of the mean value 
of X max , suggested in Ref. [1], as a possible explanation of the deviation of Auger data 
from the proton expectation. We used two different functions to fit the X max distribution 
obtained from simulations: (i) the convolution of an Exponential distribution with a 
Gaussian and (ii) a shifted-Gamma distribution. We find that the bias obtained by using 
these two functions is more than one order of magnitude smaller than the corresponding 
one of the Exponential distribution, the one used in Ref. [I]. We find that the values of 
the bias, obtained for the convolution of the Exponential function with the Gaussian, 
are larger because it presents a more extended tail to larger values of X max than the 
shifted-Gamma distribution. We also find that the bias diminishes when a Gaussian 
(symmetric) uncertainty on the determination of X max is included. 

We also calculated the expected bias, as a function of primary energy, using the 
actual number of events in each energy bin of the Auger data, published in Ref. [2], for 
both hadronic interaction models considered in this work, QGSJET-II and EPOS 1.99. 
We find that the largest value of the bias, corresponding to the bin with the smallest 
number of events, is smaller than 1.5 g cm -2 , much less than the systematic errors on 
the determination of (X max ) estimated in Ref. [2]. 



Appendix A. Parameters for the Exp-Gauss fits 



The parameters a, (3 and A, obtained from the fits of the X max distributions with the 
Exp-Gauss function (see Eq. (jSJ)), are fitted with linear functions of \og(E/eV) as shown 
in figure IA1I They can be written in the following way, 





( a(E) \ 






c 2 \ 




P(E) 




C 3 


c 4 




\ KE) J 




\ c 5 


c 6 J 


where the coefficients C i: i — 


1...6, 


are g 



\og(E/eV) 



(A.l) 



Tlfor both hadronic interaction 



models considered. 
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Figure Al. Parameters a, (3 and A corresponding to the fits of the X. max distribution 
with the Exp-Gauss function for QGSJET-II and EPOS 1.99. The straight lines 
correspond to the linear fits of the points. 



Table Al. Coefficients C h in [g cm" 2 ], corresponding to QGSJET-II and EPOS 1.99. 





Ci 


c 2 


c 3 


c 4 


c 5 


c 6 


QGSJET-II 
EPOS 1.99 


-239.053 
-443.120 


51.0096 
63.0078 


-18.0164 
13.6458 


2.23981 
0.320823 


138.806 
75.2960 


-4.57508 
-0.987241 



Appendix B. Calculation of £ for a truncated exponential distribution 

It is possible to describe the X max distribution function by a truncated exponential 
distribution, which is given by, 

{1 I X max — X c \ 
A eXP ( —) Xm "- Xc , (B.l) 
X max <c X c 

where A is a parameter that describe the tail of the X max distribution and X c is the 
truncation value. 

The characteristic function of this distribution is, <P'x I ^ iax {t) = exp(itX c ) (1 — ztA) -1 
and then, the characteristic function of the sample mean is given by, (t) = 
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exp(itX c ) (1 — itA/N) N , which corresponds to a shifted Gamma distribution. 
Therefore, the distribution function of the sample mean is given by, 



JV-l 



Pte{X, 



N n 
max i 



(X max X c ) 

T(N)(A/N) N 



exp 



X-max X c 



A/N 







vN \ y 

^max — -**-c 2^ 

YN ^ V 
^max - 



By using that (X n 
Zte(N) = 



= X c + A, it is easy to show that, 
A 1 
A + X c N' 
o~[X max ] 1 
N' 



(B.3) 
(B.4) 



Note that, it can be seen, form Eq. (lB.4j) . that £te takes a very similar form to the one 
obtained for the shifted-Gamma function, see Eq. (11 II) . 

Typical values of the parameters, obtained experimentally, are X c = 700 g cm" 2 
and A = 56 g cm -2 (note that these parameters depend on primary energy and the 
ones used here, obtained from Ref. [10], correspond to the energy interval [10 18 , 10 18 5 ] 
eV, in any case, they are just used to roughly estimate the suppression factor of the 
bias). Therefore, £,te{N) = 0.125/N, which is suppressed by a factor 0.125 with respect 
to the corresponding one to the exponential distribution. 
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